-
Notifications
You must be signed in to change notification settings - Fork 145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added OLMo support to builder.py #1061
base: main
Are you sure you want to change the base?
Added OLMo support to builder.py #1061
Conversation
…y fake layernorm Co-authored-by: Tim Costigan <[email protected]> Co-authored-by: Tim Costigan <[email protected]>"
… and set then in our override Co-authored-by: Tim Costigan <[email protected]> Co-authored-by: Tim Costigan <[email protected]>
@microsoft-github-policy-service agree company="AMD" |
@kunal-vaishnavi ptal, thanks! |
Thanks for the contribution! Does OLMo run end-to-end with the ONNX Runtime GenAI tokenizer? Can you also update the following places?
onnxruntime-genai/test/python/_test_utils.py Lines 55 to 77 in 0f59a90
The models in
You can add it to |
# Conflicts: # src/python/py/models/builder.py
…exist, which caused errors in model_qa.py
That is be updated as requested now. It runs end to end and I've also added Qwen to the CI list. |
Thank you for adding the changes. The end-to-end tests in the CIs appear to be failing due to the
|
This should be good to go! |
After some further investigation, it appears that the tokenizer CI failure is happening because the tokenizer for OLMo is not currently supported in ONNX Runtime Extensions. Once the support is added, the main branch of ONNX Runtime GenAI can be merged into this PR to integrate the changes. |
The ONNX Runtime Extensions PR has been merged now. You can update its commit ID in onnxruntime-genai/cmake/deps.txt Line 17 in 41c2543
|
…into shobrien/add-olmo-builder-support # Conflicts: # README.md # src/models/model.cpp # test/python/_test_utils.py
Hi @kunal-vaishnavi thanks for letting me know, I've bumped that dependency. |
src/models/model.cpp
Outdated
@@ -590,7 +590,7 @@ std::shared_ptr<Model> CreateModel(OrtEnv& ort_env, const char* config_path, con | |||
} | |||
|
|||
std::shared_ptr<Model> CreateModel(OrtEnv& ort_env, std::unique_ptr<Config> config) { | |||
std::set<std::string> llm_types = {"chatglm", "decoder", "gemma", "gemma2", "granite", "llama", "mistral", "nemotron", "phi", "phimoe", "phi3", "phi3small", "qwen2"}; | |||
std::set<std::string> llm_types = {"chatglm", "decoder", "gemma", "gemma2", "granite", "llama", "mistral", "nemotron", "phi", "phimoe", "phi3", "phi3small", "qwen2", "olmo"}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::set<std::string> llm_types = {"chatglm", "decoder", "gemma", "gemma2", "granite", "llama", "mistral", "nemotron", "phi", "phimoe", "phi3", "phi3small", "qwen2", "olmo"}; | |
std::set<std::string> llm_types = {"chatglm", "decoder", "gemma", "gemma2", "granite", "llama", "mistral", "nemotron", "olmo", "phi", "phimoe", "phi3", "phi3small", "qwen2"}; |
src/python/py/models/builder.py
Outdated
elif config.architectures[0] == "NemotronForCausalLM": | ||
onnx_model = NemotronModel(config, io_dtype, precision, execution_provider, cache_dir, extra_options) | ||
elif config.architectures[0] == "ChatGLMForConditionalGeneration" or config.architectures[0] == "ChatGLMModel": | ||
# Quantized ChatGLM model has ChatGLMForConditionalGeneration as architecture whereas HF model as the latter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to be a merge conflict issue. Nemotron and ChatGLM are checked earlier so they don't need to be checked again here. Can you also insert OLMo into the checks so that the alphabetical order is still maintained?
No description provided.